The strange quark mass is extracted from a finite energy sum rule (FESR) analysis of the flavor-breaking difference of light-light and light-strange quark vector-plus-axial-vector correlators, using spectral functions determined from hadronic $\tau$ decay data. We point out problems for existing FESR treatments associated with potentially slow convergence of the perturbative series for the mass-dependent terms in the OPE over certain parts of the FESR contour, and show how to construct alternate weight choices which not only cure this problem, but also (1) considerably improve the convergence of the integrated perturbative series, (2) strongly suppress contributions from the region of $s$ values where the errors on the strange current spectral function are still large and (3) essentially completely remove uncertainties associated with the subtraction of longitudinal contributions to the experimental decay distributions. The result is an extraction of $m_s$ with statistical errors comparable to those associated with the current experimental uncertainties in the determination of the CKM angle, $V_{us}$. We find $m_s(1\ {\rm GeV}) = 158.6 \pm 18.7 \pm 16.3 \pm 13.3$ MeV (where the first error is statistical, the second due to that on $V_{us}$, and the third theoretical).
I. INTRODUCTION

The light quark masses, $m_s$ and $m_u + m_d$, are among the least well determined of the fundamental parameters of the Standard Model and, as such, have been the subject of much recent attention, in both the QCD sum rule [1-17] and lattice [18-21] communities.

Recent attempts to extract $m_u + m_d$ and $m_s$ via sum rule analyses of, in the former case, the light quark ($ud$) pseudoscalar correlator [1], and in the latter case, the light-strange ($us$) scalar [2,3,5,9] or pseudoscalar [8] correlators, suffer from the problem that the relevant spectral functions are not fully determined experimentally in the region required for the analyses.

Analyses based on vector current correlators involving various pieces of the light quark electromagnetic (EM) current suffer from analogous problems. In the case of Narison's sum rule based on the difference of the flavor 33 (isovector) and 88 (hypercharge, or isoscalar) correlators [4], the G-parity-based identification of the 33 and 88 contributions to the EM hadroproduction cross-section, which would allow the difference of 33 and 88 spectral functions to be determined from experimental data, is valid only in the absence of isospin breaking (IB). The high degree of cancellation (to the level of 10-15%) between the 33 and 88 spectral integrals makes the analysis rather sensitive to the neglect of IB [7]. This sensitivity is compounded by the fact that a sum rule determination of the corrections required to remove the 38 contributions from the experimental data shows that, for reasons which are easily understood [7], the dominant corrections, associated with the $\omega$ contribution to the nominal 88 spectral function [7,22], are larger than one would naively expect.^1 The necessity of determining the IB corrections theoretically thus prevents one from working with a sum rule whose spectral side is determined solely by experimental data.

A similar problem exists for the sum rule based on the difference of 33 and ss vector current correlators [16], since 
the portion of the EM hadroproduction cross-section associated with the ss part of the EM spectral function is not 
^1 The central value $m_s(1\ {\rm GeV}) = 176$ MeV [16], obtained neglecting IB corrections, is reduced to 146 MeV when one applies the IB corrections obtained in the sum rule analysis of Ref. [22].
an experimental observable. In Ref. [16], it is assumed to be given by the cross-section for the production of the various $\phi$ resonances. This approximation, while no doubt a reasonable one, is exactly valid only if both (1) the Zweig rule is 100% satisfied and (2) the $\phi$ resonances are all pure flavor $\bar{s}s$ states. The close cancellation (to the ~15% level) between the 33 and ss spectral integrals again makes the analysis sensitive to even small (few %) Zweig rule violations (ZRV). To illustrate this sensitivity, let us take the deviation from ideal mixing in the vector meson sector as a measure of the natural scale of ZRV,^2 and consider a scenario in which ZRV occurs dominantly in the mass matrix and not in the vacuum-to-vector-meson matrix elements of the vector currents. The strange (light) quark part of the EM current then couples only to the strange (light) part of any given resonance. If the flavor content of a given $\phi$ resonance is $\alpha\,\bar{s}s + \beta(\bar{u}u + \bar{d}d)/\sqrt{2}$ (with $\alpha \simeq 1$ and $\beta$ small), the ratio of the square of the full EM decay constant to that of the decay constant describing the coupling only to the ss part of the EM current is then [...]. For either the linear or quadratic versions of mixing this ratio is less than 1; including ZRV corrections will thus increase the ss spectral function and hence lower the extracted value of $m_s$. Taking, to be specific, the case that the radius of the circular part of the FESR contour is $(1.6\ {\rm GeV})^2$, we find that, using an identical method of analysis and identical higher dimensional condensate values to those employed in Ref. [16] (and including, for completeness, the small IB isovector contribution to the $\phi(1020)$ EM decay constant determined in Ref. [22]), the central value of $m_s(1\ {\rm GeV})$ obtained ignoring IB and ZRV [16] (196 MeV) is lowered to 177 MeV (108 MeV) for the linear (quadratic) cases, respectively. We stress that the point of this exercise is not to attempt a realistic estimate of ZRV corrections but rather to point out that, given the scale at which such violations are already known to occur, the uncertainties in the extraction of $m_s$ associated with the neglect of ZRV are large, and, moreover, cannot be significantly reduced without a major improvement in our theoretical understanding of the precise nature and magnitude of ZRV.^3

In light of the fact that, in each of the analyses above, it is not possible to work with sum rules for which the hadronic spectral function is determined entirely by experimental data, we will, in this paper, instead construct finite energy sum rules (FESR's) based on the flavor-breaking difference between the sum of the $ud$ vector and axial vector correlators and the corresponding sum of $us$ correlators, for which, up to $s = m_\tau^2$, the spectral function can be taken from experimental hadronic $\tau$ decay data [23,15]. The rest of the paper is organized as follows. In Section II we provide a brief review, and discuss the practical difficulties to be overcome in arriving at a reliable implementation of this approach. In Section III we describe a construction which leads to FESR's which successfully overcome these difficulties, and in Section IV we give numerical details and discuss our results.



II. FLAVOR-BREAKING SUM RULES INVOLVING HADRONIC $\tau$ DECAY DATA

For a general correlator, $\Pi(s)$, with a cut beginning at $s = s_{th}$ and running along the timelike real axis, one obtains from Cauchy's theorem, defining the spectral function, as usual, by $\rho = {\rm Im}\,\Pi/\pi$, the general FESR relation

$$ \int_{s_{th}}^{s_0} ds\, \rho(s)\, w(s) = \frac{-1}{2\pi i} \oint_{|s|=s_0} ds\, \Pi(s)\, w(s), \tag{1} $$

where $w(s)$ is any function analytic in the region of the contour, $C$, consisting of the union of the circle of radius $s_0$ in the complex $s$-plane and the lines above and below the physical cut, running from $s_{th}$ to $s_0$.
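As a numerical illustration of Eq. (1), the following minimal sketch checks the relation for a toy correlator, $\Pi(s) = -\log(-s)$, which has a cut along $s > 0$ with $\rho = 1$ and $s_{th} = 0$. The toy correlator, the choice $s_0 = 3$ and the function names are assumptions made purely for illustration; they are not taken from the analysis itself. The weight is the kinematic L + T weight introduced later in this section.

```python
import cmath
import math

def kinematic_weight(y):
    """Kinematic L+T weight (1 - y)^2 (1 + 2y), with y = s/s0."""
    return (1 - y) ** 2 * (1 + 2 * y)

def toy_correlator(s):
    """Toy correlator Pi(s) = -log(-s): cut on s > 0, rho = Im Pi / pi = 1."""
    return -cmath.log(-s)

def spectral_side(weight, s0, n=20000):
    """Midpoint-rule integral of rho(s) w(s/s0) from 0 to s0, with rho = 1."""
    h = s0 / n
    return sum(weight((j + 0.5) * h / s0) for j in range(n)) * h

def contour_side(weight, s0, n=20000):
    """-1/(2 pi i) times the integral of Pi(s) w(s/s0) around |s| = s0."""
    h = 2 * math.pi / n
    total = 0j
    for j in range(n):
        phi = (j + 0.5) * h              # midpoints never touch the cut at phi = 0
        s = s0 * cmath.exp(1j * phi)
        total += toy_correlator(s) * weight(s / s0) * (1j * s * h)
    return (-total / (2j * math.pi)).real

s0 = 3.0
lhs = spectral_side(kinematic_weight, s0)
rhs = contour_side(kinematic_weight, s0)
print(lhs, rhs)  # both sides should agree (analytically, s0/2)
```

For this toy case both sides can be evaluated in closed form, which gives $s_0/2$; the numerical agreement simply confirms the sign and normalization conventions of Eq. (1).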

As is well known, the ratios of $ud$ and $us$ inclusive hadronic $\tau$ decay widths to the $\tau$ electronic decay width,

$$ R_\tau^{ij} \equiv \frac{\Gamma[\tau^- \to \nu_\tau\, {\rm hadrons}_{ij}\, (\gamma)]}{\Gamma[\tau^- \to \nu_\tau e^- \bar{\nu}_e (\gamma)]}, \tag{2} $$

where $(\gamma)$ indicates additional photons or lepton pairs, and $ij = ud, us$ labels the flavors of the relevant portion of the hadronic weak current, can be expressed as weighted integrals over the relevant spectral functions. Eq. (1) then allows these ratios to be recast into a form appropriate for the use of techniques based on the OPE and perturbative



^2 From Ref. [26] one has that the vector meson mixing angle is either 36° or 39°, depending on whether one uses the linear or quadratic mass formula.

^3 In Ref. [16], the agreement of the 33-88 and 33-ss determinations of $m_s$ obtained ignoring IB and ZRV, respectively, was taken as evidence against the size of the IB corrections obtained in Ref. [22]. Note, however, that (1) within errors, the latter result is compatible with either the IB-corrected or uncorrected 33-88 determination, and (2) two inverse moment sum rule determinations of the 6th order chiral low-energy constant, Q, one based on the 33-88 [24], and one on the sm-33 correlator difference [25], are brought into almost perfect agreement once the IB corrections of Ref. [22] are applied to the former analysis.
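QCD [27-31]. Letting $J^\mu_{ij;V,A}$ be the usual vector and axial vector currents with flavor content $ij$, and defining the scalar $J = 0, 1$ parts of the corresponding correlators by

$$ i\int d^4x\, e^{iq\cdot x}\, \langle 0|T\left(J^\mu_{ij;V,A}(x)\, J^\nu_{ij;V,A}(0)^\dagger\right)|0\rangle = \left(-g^{\mu\nu}q^2 + q^\mu q^\nu\right)\Pi^{(1)}_{ij;V,A}(q^2) + q^\mu q^\nu\, \Pi^{(0)}_{ij;V,A}(q^2), \tag{3} $$

one has

$$ R^{ij}_{\tau;V,A} = 12\pi^2 |V_{ij}|^2 S_{EW} \int_0^{m_\tau^2} \frac{ds}{m_\tau^2} \left(1 - \frac{s}{m_\tau^2}\right)^2 \left[\left(1 + \frac{2s}{m_\tau^2}\right)\rho^{(0+1)}_{ij;V,A}(s) - \frac{2s}{m_\tau^2}\,\rho^{(0)}_{ij;V,A}(s)\right] = 6\pi i\, S_{EW} |V_{ij}|^2 \oint_{|s|=m_\tau^2} \frac{ds}{m_\tau^2} \left(1 - \frac{s}{m_\tau^2}\right)^2 \left[\left(1 + \frac{2s}{m_\tau^2}\right)\Pi^{(0+1)}_{ij;V,A}(s) - \frac{2s}{m_\tau^2}\,\Pi^{(0)}_{ij;V,A}(s)\right], \tag{4} $$

where $\Pi^{(0+1)} = \Pi^{(0)} + \Pi^{(1)}$, $\rho^{(J)}_{ij;V,A}(s)$ are the corresponding spectral functions, $S_{EW} = 1.0194$ represents the leading electroweak corrections [32], and $V_{ij}$ are the usual CKM matrix elements. Since $m_\tau^2 \simeq 3$ GeV$^2$, the second expression in Eq. (4) is amenable to evaluation using the OPE. Dividing both the hadronic and OPE expressions by $|V_{ij}|^2$, and taking the difference of the $ij = ud$ and $us$ cases, one arrives at a flavor-breaking FESR

$$ \frac{R^{ud}_\tau}{|V_{ud}|^2} - \frac{R^{us}_\tau}{|V_{us}|^2} = 12\pi^2 S_{EW} \int_0^1 dy \left[w_{L+T}(y)\,\Delta\rho^{(0+1)}(s) + w_L(y)\,\Delta\rho^{(0)}(s)\right] = 6\pi i\, S_{EW} \oint_{|y|=1} dy \left[w_{L+T}(y)\,\Delta\Pi^{(0+1)}(s) + w_L(y)\,\Delta\Pi^{(0)}(s)\right], \tag{5} $$

where $R^{ij}_\tau = R^{ij}_{\tau;V} + R^{ij}_{\tau;A}$, $y = s/m_\tau^2$, $\Delta\Pi^{(J)} = \Pi^{(J)}_{ud;V+A} - \Pi^{(J)}_{us;V+A}$, $\Delta\rho^{(J)} = \rho^{(J)}_{ud;V+A} - \rho^{(J)}_{us;V+A}$, and $w_{L+T}$, $w_L$ refer to the longitudinal-plus-transverse ((J = 0) + (J = 1), or "L + T") and "longitudinal" (J = 0) kinematic weights $w_{L+T}(y) = (1 - y)^2 (1 + 2y)$ and $w_L(y) = -2y (1 - y)^2$, respectively. The mass-independent (D = 0) piece of the correlator difference $\Delta\Pi^{(J)}$ on the OPE side of the sum rule Eq. (5) of course vanishes by construction. In the limit that we neglect $m_{u,d}^2$ and $\alpha_s m_{u,d} m_s$ relative to $m_s^2$, moreover, the D = 2 terms in the OPE representation of $\Delta\Pi^{(J)}$ become simply proportional to $m_s^2$. Were the OPE representations of both the L + T and longitudinal contributions above to be well converged at scale $m_\tau^2$, Eq. (5) would thus allow a determination of $m_s$ in terms of the difference of experimental non-strange and strange decay number distributions.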

The perturbative series for the integrated D = 2 longitudinal contribution in Eq. (5), however, turns out not to be convergent at the scale $s_0 = m_\tau^2$ [11,12], creating a serious problem for the analysis in the absence of an experimental separation of transverse and longitudinal spectral contributions. This separation is straightforward at low $s$ but experimentally problematic above ~1 GeV$^2$.^4 Our inability to treat the OPE representation of the longitudinal contributions in a reliable manner thus creates difficult-to-quantify uncertainties for any FESR involving significant longitudinal spectral contributions. Existing analyses are included in this category since, for example, the central value for the difference of non-strange and strange spectral integrals from the analysis of Refs. [13,15],

$$ \Delta^{00} \equiv \frac{R^{ud}_\tau}{|V_{ud}|^2} - \frac{R^{us}_\tau}{|V_{us}|^2} = 0.394 \pm 0.137, \tag{6} $$

corresponds to L + T, longitudinal and higher dimension condensate contributions which are 0.184, 0.155 and 0.055, respectively.

Another practical problem is the close cancellation between the rescaled $us$ and $ud$ spectral integrals for the sum rules above, based on the kinematic weights, $w_{L+T}$ and $w_L$. In the analysis of Refs. [13,15], for example, the cancellation is



^4 In Ref. [11], an attempt was made to circumvent this problem by assuming the validity, even in the region of non-convergence, of a relation between the integrated longitudinal OPE vector and axial vector D = 2 contributions valid in the region of convergence of the OPE representations of both. If true, this would allow the longitudinal strange axial integral to be obtained from the longitudinal strange vector integral. The latter can be obtained using the model strange scalar spectral function of Ref. [5]. Using appropriately-weighted FESR's for the strange pseudoscalar channel, we have now been able to test this assumption, and demonstrate that it is, in fact, incorrect.
to the ~10% level, making the results very sensitive to both small variations in the input parameters and the sizeable experimental errors (~20-30%) on the strange decay number distribution above the $K^*$ region. Two features of the analysis of Refs. [13,15] illustrate the former sensitivity. First, Refs. [13,15] employ $|V_{us}| = 0.2218 \pm 0.0016$, cf. the PDG98 [26] value $0.2196 \pm 0.0023$. Though compatible within errors, the squares of the two central values differ by ~2%; use of the PDG98 value decreases the flavor-breaking difference, $\Delta^{00}$, by 17%. Since one cannot reliably employ the OPE representation of the longitudinal contributions, moreover, the longitudinal spectral contribution (which is dominated, at the ~80% level, by the $K$ pole term) must be subtracted; the shift in the inferred L + T contribution (used to determine $m_s$) is thus even larger (36%). Similarly, use of the PDG98 value $f_K = 113.0 \pm 1.0$ MeV in place of the ALEPH determination, $f_K = 111.5 \pm 2.5$ MeV, lowers the inferred L + T contribution to $\Delta^{00}$ by a further 12%. The combined impact on the central value for $m_s$ is thus extremely large, though the two central values are, of course, compatible within the (large) errors quoted in Refs. [13,15]. The relative size of the residual statistical errors as a fraction of the resulting $\Delta^{00}$ is, of course, also significantly increased by such a decrease in $\Delta^{00}$. It is thus highly desirable to choose, in place of the kinematic weights, weights which produce a less close cancellation between the $ud$ and $us$ spectral integrals. The easiest way to accomplish this goal is to choose weight functions which fall off more rapidly through the region of the excited strange resonances. This has the happy consequence of also suppressing contributions from the region where both the errors on the strange spectral distribution are large and the transverse/longitudinal separation is experimentally difficult.
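The way a close cancellation amplifies a small input shift can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, that the $ud$-$us$ difference is exactly 10% of the $ud$ integral (the text quotes only "~10%"); it is an order-of-magnitude check, not a reproduction of the analysis of Refs. [13,15].

```python
# Rough error-amplification estimate for a sum rule with a ~10% cancellation.
v_us_refs = 0.2218    # |V_us| employed in Refs. [13,15]
v_us_pdg98 = 0.2196   # PDG98 central value

# The rescaled us spectral integral scales like 1/|V_us|^2.
rescale_shift = (v_us_refs / v_us_pdg98) ** 2 - 1.0

# Assumption (illustrative only): difference = 10% of the ud integral,
# so the rescaled us integral is 9 times the ud-us difference.
cancellation = 0.10
us_over_difference = (1.0 - cancellation) / cancellation

# Fractional decrease of the ud-us difference induced by the |V_us| change.
relative_drop = rescale_shift * us_over_difference

print(f"shift in 1/|V_us|^2: {100 * rescale_shift:.1f}%")
print(f"induced drop in the difference: {100 * relative_drop:.0f}%")
```

Under this assumption a ~2% shift in $|V_{us}|^2$ becomes a shift of nearly 20% in the flavor-breaking difference, the same size as the 17% quoted above.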

The final difficulty to be dealt with is theoretical. Suppose we are able to solve the longitudinal/transverse separation problem, and thus work with FESR's involving only the L + T part of the flavor-breaking difference. The leading (D = 2) $m_s$-dependent terms in the OPE representation of $\Delta\Pi^{(0+1)}$ are [10]

$$ \left[\Delta\Pi^{(0+1)}(Q^2)\right]_{D=2} = \frac{3}{2\pi^2}\, \frac{m_s^2(Q^2)}{Q^2} \left[1 + \frac{7}{3}\, a(Q^2) + (19.9332)\, a(Q^2)^2 + \cdots\right] \equiv \frac{3}{2\pi^2}\, \frac{m_s^2(Q^2)}{Q^2} \sum_{k=0} g_k\, a(Q^2)^k, \tag{8} $$

with $a(Q^2) = \alpha_s(Q^2)/\pi$ and $m_s(Q^2)$ the running coupling and running strange quark mass, both at scale $\mu^2 = Q^2 = -s$, in the $\overline{\rm MS}$ scheme. The ratio of the $O(a^2)$ and $O(a)$ coefficients in Eq. (8) is rather large (8.5), signalling potentially slow convergence (with $\alpha_s(m_\tau^2) = 0.334$ [23], the ratio of the $O(a^2)$ and $O(a)$ terms is 0.90 at $\mu^2 = m_\tau^2$, and > 1 for $\mu^2$ below ~2.2 GeV$^2$). In recent analyses [13-15], this potential problem is brought under (apparent) control using the method of "contour improvement" [30]. In this method, the logarithms in $\Pi$ are first summed (as has already been done in Eq. (8)) by choosing the renormalization scale equal to $Q^2$ at each point on the circle $|s| = s_0$. The integrals

$$ A_k^{[w_{L+T}]}(s_0) \propto \oint_{|s|=s_0} \frac{ds}{s}\, m_s^2(Q^2)\, a(Q^2)^k\, w_{L+T}(y); \qquad y = s/s_0, \tag{9} $$

are then evaluated numerically, using the known 4-loop forms for the running mass and coupling. The OPE side of the L + T part of the conventional $\tau$ decay sum rule then reduces to a linear combination of the $A_k^{[w_{L+T}]}(m_\tau^2)$, $k = 0, 1, 2$, with the index $k$ giving the "contour-improved order". Both the convergence and the residual scale dependence of the resulting truncated series are significantly improved by this procedure [12,14]. Since, relative to an expansion in terms of $a(\mu^2)$, for some fixed scale $\mu^2$, contour improvement represents a resummation of the perturbative series, it is possible that this improvement is physically meaningful.
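The size of the fixed-scale convergence problem follows directly from the known coefficients in Eq. (8). The short sketch below reproduces the quoted ratios; the variable names are illustrative, and the last quantity is simply the value of $\alpha_s$ at which the $O(a^2)$ and $O(a)$ terms become equal (the scale at which this happens depends on the running and is not computed here).

```python
import math

g1 = 7.0 / 3.0          # O(a) coefficient in Eq. (8)
g2 = 19.9332            # O(a^2) coefficient in Eq. (8)
alpha_s_mtau = 0.334    # alpha_s(m_tau^2), Ref. [23]

a_mtau = alpha_s_mtau / math.pi
coefficient_ratio = g2 / g1                 # ratio of the coefficients
term_ratio = coefficient_ratio * a_mtau     # ratio of the O(a^2) to O(a) terms

# Value of alpha_s above which the O(a^2) term exceeds the O(a) term.
alpha_s_critical = math.pi / coefficient_ratio

print(f"coefficient ratio: {coefficient_ratio:.2f}")
print(f"term ratio at m_tau^2: {term_ratio:.2f}")
print(f"critical alpha_s: {alpha_s_critical:.3f}")
```

The term ratio comes out at about 0.9, so the second-order term is nearly as large as the first-order one already at the $\tau$ mass scale.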

Unfortunately, it turns out that the apparent improvement is not a general one, but rather the result of an accidental suppression of the $k = 2$ integral. To see this, let us, for illustrative purposes, imagine that the unknown coefficients, $g_k$, for $k \geq 3$, in Eq. (8) grow geometrically,^5 i.e.,

$$ g_k = (19.9332) \left[\frac{19.9332}{7/3}\right]^{k-2}, \qquad k \geq 3. $$

We then evaluate $A_k^{[w^{(N)}_{L+T}]}(s_0)$


^5 Note that Refs. [13-15] employ a form of the L + T FESR in which the OPE integral has been partially integrated once in order to re-express it in terms of the difference of L + T $ud$ and $us$ Adler functions. The contour-improved series for the Adler function version differs term-by-term from that based on the direct correlator difference. Though the agreement of the sums of the two versions to second order is excellent, the reader should bear in mind that the relative size of the terms of different order is not the same in the two cases.
for $k = 0, \cdots, 10$ and $s_0 = m_\tau^2$, where $w^{(N)}_{L+T}(y) = w_{L+T}(y)[1 - y]^N$, $N = 0, 1, 2$, are the "spectral weights" employed in the analyses of Refs. [13-15]. The results of this exercise, rescaled in each case by the corresponding $k = 0$ value, are displayed in Table I. In columns 2-4 we see the apparently favorable convergence of the $k = 0, 1, 2$ terms already discussed. The results of the remaining columns, however, show that the smallness of the $k = 2$ term is not the result of a favorable resummation (which would lead also to improved convergence for the remainder of the series) but rather a consequence of the fact that $A_k^{[w^{(N)}_{L+T}]}(m_\tau^2)$ has a zero as a function of $k$ rather close to $k = 2$. The magnitudes of the $k \geq 3$ terms are such that truncation of the series at $k = 2$ would produce a significant theoretical error, one much larger in magnitude than the size of the $k = 2$ term.^6 The contour improved analysis employing FESR's based on the spectral weights thus has potentially significant theoretical uncertainties.

In light of the problems discussed above for those FESR's based on the spectral weights, $w^{(N)}_{L+T}$, our goal in the next section will be to construct alternate weights which lead to FESR's which bring these problems under control.



III. THE CONSTRUCTION OF ALTERNATE WEIGHT FUNCTIONS 

We begin our search for an alternate choice of weight function by attempting to understand the source of the potential slow convergence of the contour-improved series noted above. The goal will be to find a weight such that, even were the unknown $g_k$, $k \geq 3$, to grow geometrically, as assumed above, the tail of the contour-improved series would be small relative to the known terms, in contrast to the behavior shown in Table I for the series corresponding to the spectral weights, $w^{(N)}_{L+T}$. If we succeed in doing so, the reliability of the standard approach, in which the truncation error is taken to be given by the size of the last known term (in this case, $k = 2$), will, of course, be improved regardless of the actual behavior of the unknown $g_k$. We will then attempt to simultaneously impose conditions which reduce the impact of the experimental errors.

To study the source of the slow convergence of the contour-improved series, it is useful to consider the behavior of the factor $f_k(Q^2) = [m_s(Q^2)]^2 [a(Q^2)]^k g_k$, appearing in the integrand of $g_k A_k^{[w]}(s_0)$, on the contour $|s| = s_0$. Let $w(y)$, $y = s/s_0$, be any analytic function real on the real $s$ axis, and $Q^2 = -s_0 \exp(i\phi)$ ($\phi = 0, \pi$ thus correspond to timelike and spacelike points, respectively). One then has

$$ g_k A_k^{[w]}(s_0) = -\int_0^\pi d\phi\, {\rm Re}\left[f_k(Q^2)\, w(\exp(i\phi))\right]. \tag{10} $$

The behavior of Re($f_k$) and Im($f_k$) as a function of $\phi$, for $s_0 = m_\tau^2$ and $k = 0, \cdots, 10$, is shown in Figure 1. We observe that both Re($f_k$) and Im($f_k$) have zeroes on the circle $|s| = m_\tau^2$, and that these zeroes move with the order $k$. Moreover, while Re($f_k$) (slowly) decreases with increasing $k$ for all angles $\phi$, the magnitude of Im($f_k$) is sizeable in the region $\phi > \pi/2$ even for $k > 5$. This slow convergence in the backwards (spacelike) direction is the origin of the slow convergence of the $k \geq 3$ tails of the integrated series shown in Table I, since the factor $(1 - y)^{2+N}$ entering the weight $w^{(N)}_{L+T}$ has maximum modulus at the spacelike point on the contour, and is more and more sharply peaked in the backward direction as $N$ increases. In addition, the behavior of Re($f_2$) and Im($f_2$) happens to be just such that, combined with the changes of sign of the real and imaginary parts of $w^{(N)}_{L+T}$, there is a very strong cancellation in the integral over $\phi$ (particularly so for the case $N = 0$). This strong cancellation is the origin of the "accidental" suppression of the magnitude of the $k = 2$ term. As we have already seen in Table I, it is potentially dangerous to use weights for which the integrals $A_k^{[w]}(s_0)$ are small for a particular $k$ (or for a small number of values of $k$) only due to such cancellations. Higher order contributions can then easily be large again, thereby spoiling the seemingly good convergence of the first few terms of the contour-improved series.
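The structure of the contour integral in Eq. (10) can be explored with a simplified numerical sketch. The version below makes two loudly stated assumptions: it uses one-loop running for the coupling and mass (the analysis itself uses the 4-loop forms), and it adopts the geometric-growth model for $g_k$, $k \geq 3$. Its output is therefore not expected to reproduce the entries of Table I; the function names and the overall normalization (units of $m_s(s_0)^2$) are likewise illustrative choices.

```python
import cmath
import math

BETA1 = 9.0 / 4.0          # one-loop beta coefficient for a = alpha_s/pi, n_f = 3
MASS_EXPONENT = 8.0 / 9.0  # one-loop: m_s^2 scales like a^(8/9) for n_f = 3

def model_coefficients(kmax):
    """g_0..g_2 from Eq. (8); g_k for k >= 3 from the geometric-growth model."""
    g = [1.0, 7.0 / 3.0, 19.9332]
    ratio = g[2] / g[1]
    while len(g) <= kmax:
        g.append(g[-1] * ratio)
    return g[: kmax + 1]

def running_coupling(a0, phi, beta1=BETA1):
    """One-loop a(Q^2) at Q^2 = -s0 exp(i phi), given a0 = a(s0)."""
    log_ratio = 1j * (phi - math.pi)   # log(Q^2 / s0) on the circle
    return a0 / (1.0 + beta1 * a0 * log_ratio)

def contour_term(k, weight, a0, g, beta1=BETA1, n=4000):
    """-Integral_0^pi dphi Re[f_k w(exp(i phi))], in units of m_s(s0)^2."""
    h = math.pi / n
    total = 0.0
    for j in range(n):
        phi = (j + 0.5) * h
        a = running_coupling(a0, phi, beta1)
        f_k = (a / a0) ** MASS_EXPONENT * a ** k * g[k]
        total += (f_k * weight(cmath.exp(1j * phi))).real * h
    return -total

def spectral_weight(n_power):
    """Spectral weight w_{L+T}(y) (1 - y)^N."""
    return lambda y: (1 - y) ** (2 + n_power) * (1 + 2 * y)

a0 = 0.334 / math.pi
g = model_coefficients(10)
for n_power in (0, 1, 2):
    w = spectral_weight(n_power)
    base = contour_term(0, w, a0, g)
    ratios = [contour_term(k, w, a0, g) / base for k in range(11)]
    print(n_power, [round(r, 3) for r in ratios])
```

A useful sanity check on such a sketch is the frozen-coupling limit (`beta1=0.0`), in which only the constant term of a polynomial weight survives the angular integration, so that the $k = 0$ term equals $-\pi$ in these units.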

The behavior of the Re($f_k$) and Im($f_k$) displayed in Figure 1 allows one not only to understand the origin of the potential convergence problem but also to construct alternate sum rules which avoid it. From Figure 1 it is evident that convergence can be improved by avoiding weights which are large in the spacelike direction. The results of Ref. [33] also indicate that, for the FESR framework to be reliable at scales $\sim m_\tau^2$, it is necessary for the weight function to have a zero at $s = s_0$ ($y = 1$).^7 We have found two approaches useful for implementing these constraints.



^6 One should bear in mind that, were one to work with the Adler function version of the L + T FESR, the assumption of geometric growth of the coefficients of the Adler function difference is not the same as the assumption of geometric growth of the coefficients of the correlator difference itself. The potential convergence problem, however, may also be demonstrated to exist in the former case.

^7 Such a zero suppresses contributions from the OPE representation in the region near the timelike real axis where, at scales $\sim m_\tau^2$ and below, data shows that it breaks down [33].
The first involves the use of polynomials with "shepherd" zeros, i.e., zeros either on, or near, the regions of the contour one wishes to suppress. The second involves the construction of weights, $w_P$, with Im($w_P$) peaked on the contour at angles $\phi < \pi/2$, thereby avoiding large contributions from Im($f_k$), $k > 1$ (see Figure 1). A convenient and effective choice is to take Im($w_P$) to have a Gaussian form on the contour. Choosing the width of the Gaussian to be 10° and the center to be $\phi_P$, good convergence of the $k \geq 3$ tail of the integrated series can be obtained for any $20° < \phi_P < 90°$. Technically, these profiles can be well represented using polynomials of degree $K \approx 20$,

$$ w_P(y) = \sum_{i=0}^{K} a_i\, y^i. \tag{11} $$

The coefficients $a_i$ are determined, upon normalizing Im($w_P$) such that $w_P(0) = 1$, by the Fourier integrals

$$ a_k = \frac{2}{\pi} \int_0^\pi d\phi\, {\rm Im}\left(w_P(\phi)\right) \sin(k\phi), \qquad k = 1 \ldots K. \tag{12} $$
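The Fourier construction of Eqs. (11) and (12) can be sketched numerically as follows. The example assumes a center of 45° (any value in the quoted range would do), reads the 10° "width" as a Gaussian standard deviation, and fixes the profile to unit peak height rather than imposing the normalization condition of the text; all three are illustrative assumptions, as are the function names.

```python
import math

def gaussian_profile(phi, center_deg=45.0, width_deg=10.0):
    """Target Im(w_P) on the contour; 'width' is read here as a standard deviation."""
    center = math.radians(center_deg)
    width = math.radians(width_deg)
    return math.exp(-0.5 * ((phi - center) / width) ** 2)

def sine_coefficients(profile, degree, n=4000):
    """a_k = (2/pi) Integral_0^pi dphi profile(phi) sin(k phi), for k = 1..degree."""
    h = math.pi / n
    coeffs = []
    for k in range(1, degree + 1):
        total = sum(
            profile((j + 0.5) * h) * math.sin(k * (j + 0.5) * h) for j in range(n)
        )
        coeffs.append(2.0 / math.pi * total * h)
    return coeffs

def imaginary_part_on_contour(coeffs, phi):
    """Im of sum_k a_k exp(i k phi) for real a_k; the constant a_0 drops out."""
    return sum(a * math.sin((k + 1) * phi) for k, a in enumerate(coeffs))

coeffs = sine_coefficients(gaussian_profile, degree=20)
grid = [math.radians(i) for i in range(181)]
max_error = max(
    abs(imaginary_part_on_contour(coeffs, phi) - gaussian_profile(phi)) for phi in grid
)
print(f"largest deviation from the Gaussian profile: {max_error:.2e}")
```

With these choices a degree-20 polynomial reproduces the Gaussian profile on the contour to well below the percent level, consistent with the statement that polynomials of this degree represent the profiles well.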



To summarize: given the problems discussed above with those FESR's involving the spectral weights, $w^{(N)}_{L+T}(y)$, we would like to find, if possible, an alternate weight choice, $w(y)$,

(1) such that $w(y)$ is strongly suppressed in the region above $s \sim 1$ GeV$^2$, in order to (a) reduce the degree of cancellation between the $ud$ and $us$ spectral integrals, (b) reduce the impact of the large experimental errors in the $us$ spectral distribution above the $K^*$ region, and (c) minimize the role of the longitudinal subtraction which must, at present, be performed theoretically; and

(2) such that $w(y)$ emphasizes those regions of the contour $|s| = s_0$ for which the convergence of the D = 2 series is favorable.

It is, of course, not a priori obvious that there exist $w(y)$ having the desired properties. We have, however, succeeded in constructing several polynomial weights which do.^8 Since, as we will see below, the resulting weights do not contain $w_{L+T}(y)$ as a factor, the approach is less inclusive than the analysis employing $w_{L+T}(y)$ [12,14], but it has the advantage of being theoretically cleaner.

The strategy involving shepherd zeros can be implemented with the zeros either on or off the contour. The first 
weight we have constructed satisfying the criteria above has all zeros on the contour, and is given by 



$$ w_{10}(y) = [1 - y]^4 [1 + y]^2 [1 + y^2] [1 + y + y^2] = 1 - y - y^2 + 2y^5 - y^8 - y^9 + y^{10}. \tag{13} $$
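The equality of the factorized and expanded forms in Eq. (13) is easy to confirm by direct polynomial multiplication. The sketch below uses plain integer coefficient lists (lowest order first); the helper names are illustrative.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest order first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    """Raise a polynomial to a non-negative integer power."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out

factors = [
    poly_pow([1, -1], 4),  # (1 - y)^4
    poly_pow([1, 1], 2),   # (1 + y)^2
    [1, 0, 1],             # 1 + y^2
    [1, 1, 1],             # 1 + y + y^2
]

w10 = [1]
for factor in factors:
    w10 = poly_mul(w10, factor)

print(w10)  # [1, -1, -1, 0, 0, 2, 0, 0, -1, -1, 1]
```

The vanishing entries at orders $y^3$, $y^4$, $y^6$ and $y^7$ are visible directly in the coefficient list, and the coefficients sum to zero, as required by the zero at $y = 1$.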



The absence of $O(y^3, y^4)$ terms, which suppresses D = 8, 10 contributions, is an additional positive feature of this weight. The fourth order zero at $y = 1$ and second order zero at $y = -1$ provide the desired suppressions of the timelike and spacelike regions. An alternate family of weights still having a fourth order zero at $y = 1$, but with the remaining zeros moved off the contour and at a distance $r$ from the origin, is

$$ w(r, \cos\theta_1, \cos\theta_2, y) = [1 - y]^4 \left[1 + \frac{y}{r}\right]^2 \left[1 + 2\frac{y}{r}\cos\theta_1 + \frac{y^2}{r^2}\right] \left[1 + 2\frac{y}{r}\cos\theta_2 + \frac{y^2}{r^2}\right] \tag{14} $$

($\theta_1$ and $\theta_2$ give the angular positions of the pairs of off-contour complex conjugate zeros corresponding to the last two factors, with respect to the spacelike direction). The choice $(r, \cos\theta_1, \cos\theta_2) = (1.2, 0.5, 0.1)$ produces a second solution to the constraints above, one whose biggest coefficient is $a_1 = -4/3$. We denote this solution by

$$ \hat{w}_{10}(y) = w(1.2, 0.5, 0.1, y). \tag{15} $$
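The coefficients of the family in Eq. (14) can be generated in the same way, which allows a direct check of the quoted value $a_1 = -4/3$ for the parameter choice of Eq. (15). The function names below are illustrative; the limiting case $(r, \cos\theta_1, \cos\theta_2) = (1, 0, 0.5)$ is included because it should reduce to the weight of Eq. (13).

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest order first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def shepherd_weight(r, cos1, cos2):
    """Coefficients (lowest order first) of w(r, cos1, cos2, y) in Eq. (14)."""
    factors = [
        [1.0, -4.0, 6.0, -4.0, 1.0],            # (1 - y)^4
        [1.0, 2.0 / r, 1.0 / r ** 2],           # (1 + y/r)^2
        [1.0, 2.0 * cos1 / r, 1.0 / r ** 2],    # first conjugate pair of zeros
        [1.0, 2.0 * cos2 / r, 1.0 / r ** 2],    # second conjugate pair of zeros
    ]
    out = [1.0]
    for factor in factors:
        out = poly_mul(out, factor)
    return out

w10_hat = shepherd_weight(1.2, 0.5, 0.1)
on_contour = shepherd_weight(1.0, 0.0, 0.5)   # should reduce to Eq. (13)

for order, coefficient in enumerate(w10_hat):
    print(order, round(coefficient, 4))
```

All coefficients of this weight are below 2 in magnitude, with the largest occurring at first order.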



In the approach based on weights which have imaginary parts with a Gaussian profile on the contour, we choose a basis of such weights having different centers, $\phi_P$. As noted above, so long as all the $\phi_P$ lie in the interval $20° < \phi_P < 90°$, all of the corresponding integrated D = 2 perturbative series will be under control. We then form linear combinations of these weights having different $\phi_P$ in such a way as to construct a new weight which not only retains this good convergence, but at the same time has a zero of sufficiently high order at $y = 1$ to strongly suppress contributions



^8 An important further restriction results from the observation that, in the FESR framework, higher dimension contributions are suppressed only by inverse powers of $s_0$; in order to avoid generating potentially large, and unknown, higher dimension contributions, therefore, the coefficients of the polynomials we construct should all be comparable in magnitude to the leading coefficient, $a_0 = 1$. We have chosen to implement this constraint by keeping all coefficients less than ~2 in magnitude.
to the spectral integral from the region $y > 0.5$. The weight of this type which most successfully satisfies the criteria discussed above has a rapid high-$s$ falloff produced by a 6th order zero at $y = 1$, a largest coefficient $a_4 = 2.087$, and is given by

$$ w_{20}(y) = (1 - y)^6 \left[1 + 4.2451y + 9.4682y^2 + 14.4155y^3 + 16.458y^4 + 14.6598y^5 + 10.2818y^6 + 5.5567y^7 + 2.1157y^8 + 0.3520y^9 - 0.2065y^{10} - 0.2154y^{11} - 0.1040y^{12} - 0.0304y^{13} - 0.0045y^{14}\right]. \tag{16} $$

The (vastly) improved convergence of the $k \geq 3$ tail of the integrated D = 2 series for the weights $w_{10}$, $\hat{w}_{10}$ and $w_{20}$ is displayed in Table II. The entries, as in Table I, have been rescaled by the corresponding $k = 0$ value, and hence correspond to the ratios, $g_k A_k^{[w]}(m_\tau^2) / A_0^{[w]}(m_\tau^2)$. The results also show that an estimate of the truncation error given by the magnitude of the $k = 2$ term is, for the new weights, almost certainly a very conservative one. We will demonstrate, in the next section, that the suppression of the high-$s$ region of the spectrum produced by the new weights is also sufficient to significantly reduce the impact of the experimental errors.
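The degree of high-$s$ suppression can be illustrated by evaluating the kinematic weight and the new polynomial weights at a few points on the real axis. The sketch below takes the bracket coefficients as printed in Eq. (16), which are rounded, so the check of $a_4$ is only approximate; the function and variable names are illustrative.

```python
import math

# Bracket coefficients of w_20 as printed in Eq. (16), lowest order first.
BRACKET_20 = [
    1.0, 4.2451, 9.4682, 14.4155, 16.458, 14.6598, 10.2818, 5.5567,
    2.1157, 0.3520, -0.2065, -0.2154, -0.1040, -0.0304, -0.0045,
]

def w_lt(y):
    """Kinematic L+T weight."""
    return (1 - y) ** 2 * (1 + 2 * y)

def w_10(y):
    """Weight of Eq. (13), in factorized form."""
    return (1 - y) ** 4 * (1 + y) ** 2 * (1 + y * y) * (1 + y + y * y)

def w_20(y):
    """Weight of Eq. (16)."""
    return (1 - y) ** 6 * sum(c * y ** n for n, c in enumerate(BRACKET_20))

# Coefficient of y^4 in the expanded w_20: binomial expansion of (1 - y)^6.
a4 = sum((-1) ** j * math.comb(6, j) * BRACKET_20[4 - j] for j in range(5))

for y in (0.25, 0.5, 0.75, 0.9):
    print(f"y = {y}: w_LT = {w_lt(y):.4f}, w_10 = {w_10(y):.4f}, w_20 = {w_20(y):.5f}")
print(f"a_4 of w_20 from the rounded coefficients: {a4:.3f}")
```

At each of the sampled points the ordering is $w_{20} < w_{10} < w_{L+T}$, with the gap widening rapidly as $y \to 1$, which is the suppression of the excited-resonance region that the construction was designed to achieve.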

IV. NUMERICAL ANALYSIS AND RESULTS 

In performing the numerical analysis of the FESR's constructed above, we employ the ALEPH data for the non-strange and strange number distributions^9 and PDG98 values for $f_K$, $f_\pi$ and $|V_{us}|$. As noted above, the weights have been chosen in such a way that, although theoretical input is required in order to subtract the longitudinal contributions to the experimental number distributions, and hence obtain the L + T spectral functions, the effect of this subtraction on the final value of $m_s$ is negligible. We will quantify this statement below. Once the L + T spectral function has been determined, it is a straightforward matter to evaluate the weighted L + T spectral integrals. The choice of steeply falling weights ensures that the strange spectral integrals are dominated by the $K$ and $K^*$ contributions, for which the experimental errors are much smaller than those of the rest of the strange number distribution. This plays a major role in reducing the impact of experimental errors on the final extracted value of $m_s$. To get a realistic determination of these errors it is important to separate correlated and uncorrelated errors, and also to take into account the strong correlations between the spectral integrals involving different weights.

The nature of the longitudinal subtraction differs significantly in the low-$s$ and high-$s$ ($\gtrsim 1$ GeV$^2$) regions. For low $s$, the $\pi$ and $K$ pole subtractions are experimentally unambiguous. For high $s$ (the resonance region), the longitudinal contributions are proportional to $(m_s \pm m_u)^2$, $(m_d \pm m_u)^2$, for $us$, $ud$, respectively, and hence dominated by the $us$ contributions. The longitudinal $us$ vector contribution is inferred from the strange scalar spectral function of Ref. [5]. This procedure is consistent provided the value of $m_s$ resulting from the present analysis is compatible with that from the strange scalar channel [9], which it turns out to be. The longitudinal $us$ axial vector contribution is similarly inferred from the spectral function of the strange pseudoscalar channel. The latter is obtained by fixing the excited resonance decay constants of a sum-of-resonances spectral ansatz through matching of the hadronic and OPE sides of a family of "pinch-weighted" FESR's, in analogy to the analysis of Ref. [34].^10 The input value of $m_s$ required for this analysis should, in principle, be determined iteratively. We have, however, employed as input the value of $m_s$ obtained from the strange scalar analysis of Ref. [9], $m_s(1\ {\rm GeV}) = 159 \pm 11$ MeV. This turns out to be consistent with our final result for $m_s$. Moreover, for the steeply-falling weights employed in our analysis, the sum of the high-$s$


^9 The 1998 tabulation of the nonstrange data receives a small overall normalization correction as a result of the shift in $R^{\ldots}$ between the preliminary 1998 and final 1999 analyses. We thank Shaomin Chen for bringing this point to our attention.

^10 The corresponding procedure works very well in the isovector vector channel, where the results can be checked against the well-known experimental spectral function [34]. A similar statement is true even in channels with strongly attractive interactions near threshold, for which the spectral function will be poorly represented near threshold by the tail of a Breit-Wigner resonance form with "conventional" $s$-dependent width. For example, using the value of $m_s$ obtained from the strange scalar channel analysis as input and redoing the strange scalar channel analysis, using now a sum-of-resonances spectral ansatz in place of the more realistic ansatz of Ref. [5], one finds that the ansatz of Ref. [5] is well-reproduced in the region of the dominant $K_0^*(1430)$ peak. One can also use this approach to check the self-consistency between the assumed longitudinal contributions and the output $m_s$ value in the kinematic-weight-based analysis of Refs. [13,15]. It turns out that the high-$s$ longitudinal contributions assumed are more than a factor of 2 smaller than would be expected based on the extracted value of $m_s$. If one employs the PDG98 values for $|V_{us}|$ and $f_K$, as discussed above, however, the assumed longitudinal contribution becomes compatible within the errors assigned to it in Refs. [13,15].
V and A longitudinal subtractions is at the < 0.1% level of the us spectral integral, and hence at the < 1% level in
the ud-us difference. As such, even were our evaluation to be in error by 100%, the effect on $m_s$ would be completely
negligible on the scale of the other errors present in the analysis.

On the OPE side, we retain contributions up to and including D = 8. The leading D = 2 term was given above. 

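The D = 4 contribution is [29,10]

$$ [\Pi(Q^2)]_{D=4} = \frac{2}{Q^4} \left[ \left( m_\ell \langle\bar\ell\ell\rangle - I_s \right) \left( 1 - a(Q^2) - \frac{13}{3}\, a(Q^2)^2 \right) + \frac{3}{7\pi^2}\, m_s^4(Q^2) \left( \frac{1}{a(Q^2)} - \frac{7}{12} \right) \right] , \qquad (17) $$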

where $I_s$ is the usual RG-invariant modification of the non-normal-ordered strange quark condensate [35], $m_\ell$ is the
average of the light u, d masses, and $\langle\bar\ell\ell\rangle$ is the light (u, d) condensate. We use the quark mass ratios determined
from the ChPT analyses of Ref. [36], the GMOR relation $2 m_\ell \langle\bar\ell\ell\rangle = -f_\pi^2 m_\pi^2$, and the range of values
$0.7 < \langle\bar s s\rangle / \langle\bar\ell\ell\rangle < 1$ [2,3] for the ratio of condensates. The contour integrals are performed as described below.

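For the D = 6 contribution we employ a rescaled version of the vacuum saturation approximation (VSA). From
the results of Ref. [29], one finds

$$ [\Pi(Q^2)]_{D=6} = \frac{64\pi\rho\alpha_s}{81\, Q^6} \left[ \langle\bar\ell\ell\rangle^2 - \langle\bar s s\rangle^2 \right] , \qquad (18) $$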

where $\rho$ represents a multiplicative rescaling of the VSA estimate. The analogous rescaling has been determined
empirically for the isovector vector channel and the isospin-breaking vector 38 correlator, and found to be $\sim 5$ in
both cases [37,22]. For the weights employed in our analysis, it turns out that the integrated D = 6 contributions
are very small. We are, therefore, able to employ the very conservative estimate $\rho = 5 \pm 5$ for the degree of VSA
violation without significantly affecting the overall theoretical error. The combination $\rho\alpha_s\langle\bar q q\rangle^2$ in Eq. (18) is to
be understood as an effective RG-invariant combination for the evaluation of the OPE contour integrals.
Finally, for the D = 8 contribution, we assume

$$ [\Pi(Q^2)]_{D=8} = \frac{C_8}{Q^8} . \qquad (19) $$

For $w_{10}$ this term does not contribute to the integrated OPE; for $w_{20}$ and $\hat{w}_{10}$, the value of the effective RG-invariant
condensate combination, $C_8$, is to be determined as part of the analysis.
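
Whether a given dimension survives the contour integration can be read off from the polynomial coefficients of the weight. The following is a minimal sketch, assuming the logarithmic $Q^2$ dependence of the Wilson coefficients is neglected: a term $C_D/Q^D$ integrated around $|s| = s_0$ against $w(y) = \sum_k c_k y^k$, $y = s/s_0$, then picks out only the coefficient with $D = 2k + 2$. The function names are hypothetical, and the weights used for illustration are the $(1-y)^{2+N}(1+2y)$ family of Table I rather than the new weights of the text.

```python
from numpy.polynomial import polynomial as P

def weight_coeffs(N):
    """Coefficients c_k (ascending powers of y) of (1 - y)^(2+N) * (1 + 2y)."""
    one_minus_y = [1.0, -1.0]
    return P.polymul(P.polypow(one_minus_y, 2 + N), [1.0, 2.0])

def contributing_dimensions(coeffs, tol=1e-12):
    """Dimensions D = 2k + 2 whose log-free OPE term C_D / Q^D survives
    the contour integral, i.e. those with a nonzero coefficient c_k."""
    return [2 * k + 2 for k, c in enumerate(coeffs) if abs(c) > tol]

if __name__ == "__main__":
    for N in (0, 1, 2):
        c = weight_coeffs(N)
        print(f"N = {N}: c_k = {c}, dimensions = {contributing_dimensions(c)}")
```

For N = 0 the weight is $1 - 3y^2 + 2y^3$, so in this approximation only D = 2, 6 and 8 contribute. A weight with vanishing $y^3$ coefficient has no integrated D = 8 term in the same sense.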

As noted above, the OPE contour integrals (for all D) are performed using the contour improvement prescription.
Four-loop versions of the running mass and coupling are employed. To be specific, we have solved analytically for
the running mass and coupling using the 4-loop truncated versions of the $\beta$ [38] and $\gamma$ [39] functions, with the value
determined in nonstrange hadronic $\tau$ decays, $\alpha_s(m_\tau^2) = 0.334 \pm 0.022$ [23], as input. Following conventional practice,
we take the error associated with the truncation of the perturbative series for the Wilson coefficient of the D = 2
term at $O(a^2)$ to be equal to the value of the last ($O(a^2)$) contribution retained. In light of the discussion above we
consider this to represent an extremely conservative estimate.
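
The running of the mass and coupling can be illustrated numerically. The sketch below is not the analytic solution used in the analysis: it integrates the four-loop truncated renormalization group equations with a fixed-step Runge-Kutta method, in the convention $a = \alpha_s/\pi$, $da/d\ln\mu^2 = -\sum_i \beta_i a^{i+2}$, $d\ln m/d\ln\mu^2 = -\sum_i \gamma_i a^{i+1}$. The $n_f = 3$ coefficient values, the value $m_\tau = 1.777$ GeV and all function names are assumptions introduced for the illustration.

```python
import math

# Assumed n_f = 3 MS-bar coefficients in the a = alpha_s/pi convention.
BETA = (9.0 / 4.0, 4.0, 3863.0 / 384.0, 47.228)
GAMMA = (1.0, 91.0 / 24.0, 12.4202, 44.2628)

def rhs(a):
    """Return (da/dt, dln(m)/dt) with t = ln(mu^2)."""
    da = -sum(b * a ** (i + 2) for i, b in enumerate(BETA))
    dlnm = -sum(g * a ** (i + 1) for i, g in enumerate(GAMMA))
    return da, dlnm

def run(a0, mu2_from, mu2_to, steps=2000):
    """RK4 evolution from mu2_from to mu2_to.
    Returns (a(mu2_to), m(mu2_to) / m(mu2_from))."""
    h = (math.log(mu2_to) - math.log(mu2_from)) / steps
    a, lnm = a0, 0.0
    for _ in range(steps):
        k1a, k1m = rhs(a)
        k2a, k2m = rhs(a + 0.5 * h * k1a)
        k3a, k3m = rhs(a + 0.5 * h * k2a)
        k4a, k4m = rhs(a + h * k3a)
        a += h * (k1a + 2 * k2a + 2 * k3a + k4a) / 6.0
        lnm += h * (k1m + 2 * k2m + 2 * k3m + k4m) / 6.0
    return a, math.exp(lnm)

M_TAU2 = 1.777 ** 2            # assumed tau mass squared, GeV^2
a_tau = 0.334 / math.pi        # central input value quoted in the text
a_1, m1_over_mtau = run(a_tau, M_TAU2, 1.0)
a_4, m4_over_mtau = run(a_tau, M_TAU2, 4.0)
ratio = m4_over_mtau / m1_over_mtau   # m_s(4 GeV^2) / m_s(1 GeV^2)

if __name__ == "__main__":
    print(f"alpha_s(1 GeV^2) = {math.pi * a_1:.3f}")
    print(f"alpha_s(4 GeV^2) = {math.pi * a_4:.3f}")
    print(f"m_s(4 GeV^2) / m_s(1 GeV^2) = {ratio:.3f}")
```

The resulting ratio can be compared with 115.1/158.6 ≈ 0.726, the ratio of the central values in Eqs. (21) and (20).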

From the point of view of uncertainties on the OPE side, the $w_{10}$ sum rule is favored over the $\hat{w}_{10}$ and $w_{20}$ sum
rules for three reasons: (1) it has no D = 8, 10 contributions, (2) it has the smallest truncation error, and (3) it has
the smallest errors associated with uncertainties in the input values of the D = 4 and D = 6 condensates.^11 In Table
III we display, as a function of $s_0$, the extracted values of $m_s(1\ {\rm GeV}^2)$ obtained from the $w_{10}$ sum rule, analyzed
neglecting contributions of dimension 12 and higher. Central values have been used for all input on the OPE side
and for the experimental spectral data. For the analysis to be self-consistent, the extracted value of $m_s$ should be
independent of $s_0$. This will be true for $s_0$ sufficiently large that the $D \geq 12$ contributions are negligible. As $s_0$ is
decreased, the extracted $m_s$ values should eventually deviate from a constant, signalling the growth of the higher
dimension terms. From the Table we see that the range $2.75\ {\rm GeV}^2 < s_0 < 3.15\ {\rm GeV}^2$ provides an extremely good
window of stability. In view of the falloff beginning around $s_0 \sim 2.55\ {\rm GeV}^2$, we will work in the range $s_0 > 2.55\ {\rm GeV}^2$
in the discussions which follow. It is worth stressing that the central values obtained from the $w_{20}$ and $\hat{w}_{10}$ sum rules,
though having slightly larger theoretical errors, are nonetheless completely consistent with those above: in the window


^11 Combining the errors associated with truncation, the condensate input values, and the uncertainty on $\alpha_s(m_\tau^2)$ in quadrature,
the resulting errors on $m_s$ are 7.7%, 8.2% and 8.4% for $w_{10}$, $\hat{w}_{10}$ and $w_{20}$, respectively.
$2.55\ {\rm GeV}^2 < s_0 < 3.15\ {\rm GeV}^2$ one finds that the range of solutions for $m_s(1\ {\rm GeV}^2)$ lies between 156 and 161 MeV for
$w_{20}$, 158 and 164 MeV for $\hat{w}_{10}$, and, as we saw already in Table III, 159 and 163 MeV for $w_{10}$. In contrast, the $w^{(0,0)}_{L+T}$
sum rule, for which the longitudinal subtraction is important, and the D = 2 convergence is not well under control,
yields a range between 161 and 184 MeV (with, moreover, inconsistent solutions for $C_8$).
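
The stability-window statements for the weight of Table III can be checked directly from the tabulated entries. The short sketch below simply takes the minimum and maximum of the extracted values above a chosen lower edge in $s_0$; the function and variable names are hypothetical.

```python
# Table III entries: s_0 in GeV^2 and the extracted m_s(1 GeV^2) in MeV.
S0 = [2.35, 2.55, 2.75, 2.95, 3.15]
MS = [153.2, 159.0, 162.2, 163.4, 163.2]

def window_range(s0_values, ms_values, s0_min):
    """Smallest and largest extracted m_s for all s_0 >= s0_min."""
    selected = [m for s, m in zip(s0_values, ms_values) if s >= s0_min]
    return min(selected), max(selected)

if __name__ == "__main__":
    for edge in (2.35, 2.55, 2.75):
        lo, hi = window_range(S0, MS, edge)
        print(f"s_0 >= {edge}: {lo} to {hi} MeV (spread {hi - lo:.1f} MeV)")
```

With the lower edge at 2.55 GeV$^2$ the tabulated values span 159.0 to 163.4 MeV, and above 2.75 GeV$^2$ they vary by only 1.2 MeV.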

From the point of view of the impact of the errors present in existing experimental data, the theoretically favored
$w_{10}$ weight is, unfortunately, no longer the favored one. The reason is that, although the impact of the errors in
the high-s region of the us spectrum has been strongly suppressed by the rapid falloff of the weights employed, the
ud-us cancellation is still rather close (e.g., at $s_0 = m_\tau^2$, to the level of 6.0% for $w_{10}$, 6.8% for $\hat{w}_{10}$ and 8.6% for
$w_{20}$, to be compared with 3.7%, 6.5% and 9.3% for the $w^{(N,0)}_{L+T}$, N = 0,1,2). Although the dominant errors (those
from the K* region of the us spectrum) are reasonably small, they are still large enough that the relative size of the
residual statistical error grows very rapidly with the increase in the degree of cancellation. Thus, e.g., at $s_0 = m_\tau^2$,
the statistical error represents 42%, 36%, 26%, 77%, 38% and 23% of the ud-us spectral difference for the $w_{10}$, $\hat{w}_{10}$,
$w_{20}$, $w^{(0,0)}_{L+T}$, $w^{(1,0)}_{L+T}$ and $w^{(2,0)}_{L+T}$ sum rules, respectively.^12 The present experimental situation is, therefore, such that
the errors on our final result for $m_s$ are minimized by working with $w_{20}$, rather than $w_{10}$.
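
The amplification of the statistical error by the ud-us cancellation can be made explicit with the percentages quoted above. The sketch below assumes one particular reading of those numbers, namely that "cancellation to the level of x%" means the ud-us difference is x% of the individual spectral integrals. Under that assumption, multiplying the two percentages gives the statistical error as a fraction of a single spectral integral. The labels and names are hypothetical.

```python
# Percentages quoted in the text at s_0 = m_tau^2, in the order given there.
LABELS = ["w_10", "w_10 (hatted)", "w_20", "L+T, N=0", "L+T, N=1", "L+T, N=2"]
CANCELLATION_PCT = [6.0, 6.8, 8.6, 3.7, 6.5, 9.3]    # ud-us difference / single integral
STAT_ERR_PCT = [42.0, 36.0, 26.0, 77.0, 38.0, 23.0]  # stat. error / ud-us difference

def error_relative_to_single_integral(cancel_pct, stat_pct):
    """Statistical error as a percentage of one spectral integral
    (assumed reading: product of the two quoted fractions)."""
    return cancel_pct * stat_pct / 100.0

IMPLIED_PCT = {
    label: error_relative_to_single_integral(c, s)
    for label, c, s in zip(LABELS, CANCELLATION_PCT, STAT_ERR_PCT)
}

if __name__ == "__main__":
    for label, value in IMPLIED_PCT.items():
        print(f"{label:14s} {value:.2f}% of a single spectral integral")
```

On this reading the underlying error is a few percent of a single integral for every weight, and the large relative errors on the difference arise from the cancellation, which is why the weight with the weakest cancellation among the new weights gives the smallest statistical error.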

Working with the $w_{20}$ sum rule in the window specified above we find, for our best fit,

$$ m_s(1\ {\rm GeV}^2) = 158.6 \pm 18.7 \pm 16.3 \pm 13.3\ {\rm MeV} , \qquad (20) $$

which is equivalent to

$$ m_s(4\ {\rm GeV}^2) = 115.1 \pm 13.6 \pm 11.8 \pm 9.7\ {\rm MeV} , \qquad (21) $$

where in both of Eqs. (20) and (21) the first error is statistical, the second is due to the uncertainty on $|V_{us}|$, and
the third theoretical. The theoretical error has been obtained by combining the following in quadrature (where we
quote the numerical values corresponding to Eq. (20) to be specific): ±5.2 MeV, associated with the error on $\alpha_s(m_\tau^2)$;
±3.6 MeV, associated with the uncertainty in $\langle\bar s s\rangle / \langle\bar\ell\ell\rangle$; ±1.6 MeV, associated with the variation of $m_s$ within
the window $2.55\ {\rm GeV}^2 < s_0 < m_\tau^2$; ±0.6 MeV, associated with the uncertainty in the VSA-violating parameter, $\rho$;
and ±11.6 MeV, associated with truncation of the D = 2 series. The latter obviously remains the dominant source
of theoretical error, despite the significant improvement produced by the use of the new weights. Figure 2 displays
the quality of the match between the OPE and spectral integral sides of the $w_{20}$ sum rule corresponding to the fit
above; the agreement in the previously-established stability window, $s_0 > 2.55\ {\rm GeV}^2$, is obviously excellent. The
divergence of the OPE and spectral integral curves below $s_0 \sim 2.55\ {\rm GeV}^2$ is precisely what one would expect based
on the observation above that, for the $w_{10}$ sum rule, D > 10 contributions, not included in the truncated OPE
representation, begin to become important in this region.
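
The error budget quoted above can be cross-checked with a few lines of arithmetic. The sketch below combines the five theoretical components in quadrature and rescales the three errors of Eq. (20) by the ratio of central values to compare with Eq. (21). The variable names are hypothetical, and the rescaling by a common factor is an assumption about how Eq. (21) was obtained.

```python
import math

# Theoretical error components quoted for Eq. (20), in MeV.
COMPONENTS = {
    "alpha_s(m_tau^2)": 5.2,
    "condensate ratio": 3.6,
    "s_0 window": 1.6,
    "VSA parameter rho": 0.6,
    "D = 2 truncation": 11.6,
}

# Quadrature sum of the components.
total_theory = math.sqrt(sum(v * v for v in COMPONENTS.values()))

# Share of the variance carried by the truncation error.
truncation_share = COMPONENTS["D = 2 truncation"] ** 2 / total_theory ** 2

# Rescale the errors of Eq. (20) by the ratio of central values, Eq. (21) / Eq. (20).
scale = 115.1 / 158.6
rescaled = [round(err * scale, 1) for err in (18.7, 16.3, 13.3)]

if __name__ == "__main__":
    print(f"theoretical error in quadrature: {total_theory:.2f} MeV")
    print(f"truncation share of the variance: {truncation_share:.0%}")
    print(f"errors rescaled to 4 GeV^2: {rescaled} MeV")
```

The quadrature sum reproduces the quoted ±13.3 MeV, and the rescaled errors agree with those in Eq. (21).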

The result of Eqs. (20) and (21) is in good agreement with the strange scalar channel results of Refs. [5] and [9],
the strange pseudoscalar channel result of Ref. [8], and the recent hadronic $\tau$ decay analysis of Ref. [14], but, we
believe, has significantly reduced theoretical and experimental errors. In particular, the statistical error has, at this
point, been reduced almost to the level of that associated with the uncertainty in $|V_{us}|$.

Improvements in the accuracy of the experimental us spectral data, in particular in the K* region, could lead to a
significant improvement in the size of the statistical error. Such an improvement should be possible using BaBar data
[40]. Reduced uncertainties in our knowledge of $|V_{us}|$ would also be helpful. On the theoretical side, while significant
improvements in the accuracy of the spectral data would allow one to move from the $w_{20}$ to the $w_{10}$ sum rule, the
decrease in the theoretical uncertainty that would result from this shift would be only 1.3 MeV. Far more likely to
lead to a significant improvement in the size of the theoretical error would be a computation of the $O(a^3)$ coefficient
in the D = 2 contribution to the flavor-breaking correlator difference, $\Pi$.
to 19% when $s_0$ is lowered from $m_\tau^2$ to $2.55\ {\rm GeV}^2$.
TABLE I. OPE convergence of the "contour improved" D = 2 contributions, $g_k A_k^{w}(m_\tau^2)$, as a function of the contour
improved order, k, for the spectral weights $w^{(N,0)}_{L+T}(y) = (1-y)^{2+N}(1+2y)$, N = 0, 1, 2, assuming geometric growth of
coefficients beyond $O(a^2)$. All entries have been rescaled by the corresponding entry for k = 0.

Weight   k = 0   k = 1   k = 2    k = 3    k = 4    k = 5    k = 6    k = 7    k = 8    k = 9    k = 10
N = 0    1       0.143   -0.007   -0.145   -0.237   -0.286   -0.294   -0.272   -0.233   -0.187   -0.141
N = 1    1       0.209    0.100   -0.027   -0.143   -0.232   -0.287   -0.308   -0.300   -0.272   -0.233
N = 2    1       0.257    0.187    0.076   -0.048   -0.143   -0.260   -0.324   -0.357   -0.359   -0.339

TABLE II. OPE convergence of the "contour improved" D = 2 contributions, $g_k A_k^{w}(m_\tau^2)$, as a function of the contour
improved order, k, for the weights $w_{10}$, $\hat{w}_{10}$, and $w_{20}$, assuming geometric growth of coefficients beyond $O(a^2)$. All entries
have been rescaled by the corresponding entry for k = 0.

Weight         k = 0   k = 1   k = 2   k = 3   k = 4   k = 5    k = 6    k = 7    k = 8    k = 9    k = 10
w_20           1       0.262   0.213   0.143   0.073    0.018   -0.017   -0.033   -0.034   -0.027   -0.016
w_10           1       0.232   0.165   0.092   0.032   -0.008   -0.030   -0.038   -0.038   -0.035   -0.032
w_10 (hatted)  1       0.248   0.193   0.125   0.064    0.019   -0.009   -0.023   -0.026   -0.024   -0.020


TABLE III. The extracted value of $m_s(1\ {\rm GeV}^2)$ in MeV as a function of $s_0$ for the weight $w_{10}$ having no D = 8, 10
contributions.

s_0 (GeV^2):          2.35    2.55    2.75    2.95    3.15
m_s(1 GeV^2) (MeV):   153.2   159.0   162.2   163.4   163.2
FIG. 2. The agreement between the OPE and hadronic sides of the FESR corresponding to the weight $w_{20}(y)$ for
$1.95\ {\rm GeV}^2 < s_0 < m_\tau^2$. The solid line is the OPE side, using the values of $m_s$ and $C_8$ obtained in the fitting procedure
described in the text. The dashed line is the hadronic side, obtained using the ALEPH spectral data from which the
longitudinal component has been subtracted as described in the text.